CNN models for readability of Chinese texts

Abstract

The readability of Chinese texts considered in this paper is a multi-class classification problem with 12 grade classes, corresponding to 6 grades in primary schools, 3 grades in middle schools, and 3 grades in high schools. A special property of this problem is the strong ambiguity in determining the grades. To overcome this difficulty, a measurement of readability assessment methods used empirically in practice is adjacent accuracy, in addition to exact accuracy. In this paper we give mathematical definitions of these concepts in a learning theory framework and compare the two quantities in terms of the ambiguity level of texts. A deep learning algorithm is proposed for the readability of Chinese texts, based on convolutional neural networks and a pre-trained BERT model for vector representations of Chinese characters. The proposed CNN model can extract sentence and text features by convolutions with filters and is efficient for readability assessment, which is demonstrated by some numerical experiments.
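The two evaluation measures named in the abstract can be illustrated with a short sketch. It assumes the convention common in readability assessment, that a prediction counts as adjacent-correct when it lies within one grade of the true grade. The paper's own formal definitions are not reproduced here, and the function names and the `tolerance` parameter are hypothetical.

```python
from typing import Sequence


def exact_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of texts whose predicted grade equals the true grade."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(t == p for t, p in zip(y_true, y_pred))
    return hits / len(y_true)


def adjacent_accuracy(
    y_true: Sequence[int], y_pred: Sequence[int], tolerance: int = 1
) -> float:
    """Fraction of texts whose predicted grade is within `tolerance`
    grades of the true grade (assumed convention: tolerance = 1)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(abs(t - p) <= tolerance for t, p in zip(y_true, y_pred))
    return hits / len(y_true)


# Illustrative grades in 1..12 (made-up labels, not data from the paper).
true_grades = [1, 4, 7, 12, 9]
pred_grades = [1, 5, 9, 11, 9]

print(exact_accuracy(true_grades, pred_grades))
print(adjacent_accuracy(true_grades, pred_grades))
```

Adjacent accuracy is never lower than exact accuracy under this convention, since every exact match is also within one grade. This is why it is the more forgiving measure when grade labels are ambiguous.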

Similar resources

Sorting Texts by Readability

This article presents a novel approach for readability assessment through sorting. A comparator that judges the relative readability between two texts is generated through machine learning, and a given set of texts is sorted by this comparator. Our proposal is advantageous because it solves the problem of a lack of training data, because the construction of the comparator only requires training...

On The Applicability of Readability Models to Web Texts

An increasing range of features is being used for automatic readability classification. The impact of the features typically is evaluated using reference corpora containing graded reading material. But how do the readability models and the features they are based on perform on real-world web texts? In this paper, we want to take a step towards understanding this aspect on the basis of a broad r...

Readability Assessment of Translated Texts

In this paper we investigate how readability varies between texts originally written in English and texts translated into English. For quantification, we analyze several factors that are relevant in assessing readability – shallow, lexical and morpho-syntactic features – and we employ the widely used Flesch-Kincaid formula to measure the variation of the readability level between original Engli...

A multivariate model for classifying texts' readability

We report on results from using the multivariate readability model SVIT to classify texts into various levels. We investigate how the language features integrated in the SVIT model can be transformed to values on known criteria like vocabulary, grammatical fluency and propositional knowledge. Such text criteria, sensitive to content, readability and genre in combination with the profile of a st...

Measuring Readability of Polish Texts: Baseline Experiments

Measuring readability of a text is the first sensible step to its simplification. In this paper we present an overview of the most common approaches to automatic measuring of readability. Of the described ones, we implemented and evaluated: Gunning FOG index, Flesch-based Pisarek method. We also present two other approaches. The first one is based on measuring distributional lexical similarity ...

Journal

Journal title: Mathematical Foundations of Computing

Year: 2022

ISSN: 2577-8838

DOI: https://doi.org/10.3934/mfc.2022021